High Quality Arabic Lexical Ontology Based on MUHIT, WordNet, SUMO and DBpedia

Authors

  • Eslam Kamal
  • Mohsen Rashwan
  • Sameh Alansary

Abstract

In this paper, we aim to move ontology-based Arabic NLP forward by experimenting with the generation of a comprehensive Arabic lexical ontology using multiple language resources. We recommend a combination of MUHIT, WordNet and SUMO and use a simple method to link them, which results in the generation of an Arabic-lexicalized version of the SUMO ontology. Then, we evaluate the generated ontology, and propose a method for increasing its named entity coverage using DBpedia, English-to-Arabic Transliteration, and Named Entity Recognition. We end up with an Arabic lexical ontology that has 228K Arabic synsets, linked to 7.8K concepts and 143K instances. This ontology achieves a precision of 96.9% and recall of 75.5% for NLU scenarios.
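The linking idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each Arabic lexicon entry is already associated with one or more WordNet synset identifiers and that a separate WordNet-to-SUMO mapping is available; composing the two gives an Arabic-to-SUMO link. The identifiers, the toy entries, and the function names `link_arabic_to_sumo` and `precision_recall` are hypothetical and do not come from the paper, and the precision/recall helper shows only the standard definitions, not the authors' evaluation protocol.

```python
from collections import defaultdict

# Hypothetical toy data: the identifiers and entries are illustrative only
# and are not taken from MUHIT, WordNet, or SUMO.
arabic_to_synset = {
    "كتاب": ["wn:book.n.01"],       # "book"
    "نهر": ["wn:river.n.01"],       # "river"
    "مجهول": ["wn:unmapped.n.01"],  # entry whose synset has no SUMO mapping
}
synset_to_sumo = {
    "wn:book.n.01": "Book",
    "wn:river.n.01": "River",
}


def link_arabic_to_sumo(arabic_to_synset, synset_to_sumo):
    """Compose the two mappings: Arabic word -> synset -> SUMO concept."""
    linked = defaultdict(set)
    for word, synsets in arabic_to_synset.items():
        for synset in synsets:
            concept = synset_to_sumo.get(synset)
            if concept is not None:  # skip synsets without a SUMO mapping
                linked[word].add(concept)
    return dict(linked)


def precision_recall(predicted, gold):
    """Standard precision and recall over sets of (word, concept) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


linked = link_arabic_to_sumo(arabic_to_synset, synset_to_sumo)
```

In this sketch, words whose synsets have no SUMO mapping are simply left out of the result, which is one way such a pipeline could lose recall while keeping precision high.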

Similar resources

Paradigmatic Morphology and Subjectivity Mark-Up in the RoWordNet Lexical Ontology

Lexical ontologies are fundamental resources for any linguistic application with wide coverage. The reference lexical ontology is the ensemble made of Princeton WordNet, a huge semantic network, and SUMO&MILO ontology, the concepts of which are labelling each synonymic series of Princeton WordNet. This lexical ontology was developed for English language, but currently there are more than 50 sim...

Sinica BOW (Bilingual Ontological Wordnet): Integration of Bilingual WordNet and SUMO

The Academia Sinica Bilingual Ontological Wordnet (Sinica BOW) integrates three resources: WordNet, English-Chinese Translation Equivalents Database (ECTED), and SUMO (Suggested Upper Merged Ontology). The three resources were originally linked in two pairs: WordNet 1.6 was manually mapped to SUMO (Niles & Pease 2003) and also to ECTED (the English lemmas in WordNet were mapped to their Chinese...

Adjectives in the Dutch Semantic Lexical Database CORNETTO

The goal of this paper is to describe how adjectives are encoded in Cornetto, a semantic lexical database for Dutch. Cornetto combines two existing lexical resources with different semantic organisation, i.e. Dutch Wordnet (DWN) with a synset organisation and Referentie Bestand Nederlands (RBN) with an organisation in Lexical Units. Both resources will be aligned and mapped on the formal ontolo...

Linking Lexicons and Ontologies: Mapping WordNet to the Suggested Upper Merged Ontology

Abstract: Ontologies are becoming extremely useful tools for sophisticated software engineering. Designing applications, databases, and knowledge bases with reference to a common ontology can mean shorter development cycles, easier and faster integration with other software and content, and a more scalable product. Although ontologies are a very promising solution to some of the most pressing p...

A framework for constructing cognition ontologies using WordNet, FrameNet, and SUMO

Psychoinformatics is an emerging discipline that uses tools from the information sciences to organize psychological data. This article supports that objective by proposing a framework for constructing cognition ontologies by using WordNet, FrameNet, and the Suggested Upper Merged Ontology (SUMO). The first section describes the major characteristics of each of these tools. WordNet is a...

Publication date: 2015